Patent abstract:
An information processing apparatus has an acquisition unit (201) configured to acquire an input image, and an extraction unit (205) configured to extract, from the input image, in an area corresponding to a part of the space having a predetermined surface in the background, an object area corresponding to an object on the side closer to the foreground than the surface, using a first reference, and to extract, from the input image, in an area corresponding to a part of the space not having the surface in the background, an object area corresponding to the object, using a second reference different from the first reference.
Publication number: FR3023629A1
Application number: FR1556329
Filing date: 2015-07-03
Publication date: 2016-01-15
Inventor: Yuki Kamamori
Applicant: Canon Inc.
IPC main class:
Patent description:

BACKGROUND OF THE INVENTION

[0001] The present invention relates to a technique for detecting an object from an image, a method for controlling the apparatus, and a storage medium, and more particularly to identifying, on the basis of distance information, an object contained in a background and a moving object related to a gesture or other operation.
[0002] Description of Related Art A system for performing an operation on an object displayed or placed on a board such as a whiteboard or a table, using a user interface (UI) that receives an input by a gesture, has been proposed.  In such a system, it is preferable that, in the two-dimensional plane parallel to the table, the difference between the operation area in which the system can recognize an operation performed by a gesture, a touch or the like, and the size and shape of the board, is small.  In other words, it is expected that an operation can be performed over almost the entire area of the board.  In the above-mentioned system, in many cases, an image containing the board at a certain angle of view is captured in a state in which the flat portion of a predetermined table faces the sensor, and from the captured image an object moving above the board is detected in order to recognize an operation performed by the object.  In an area in which the fixed board forms part of the background, the area containing the object above the board can easily be extracted using known background subtraction.  However, in an area (if the board is a table, the periphery of the table) not having the board in the background, the positions and the number of objects in the background are unstable compared with the area above the board.  Therefore, it is not always appropriate to use background subtraction in the same way as for object detection above the board.  [0004] Japanese Patent Application Laid-Open No. 2007-64894 discloses, in a process for detecting an object from an image, the use of different object detection methods for respective partial areas according to predetermined factors.  In a system in which the operation area and the predetermined board have different sizes and shapes, object detection must be performed both in an area (above the board) having the board in the background, and in an area that does not have the board in the background.  In the known art, the detection of an object extending over two areas having different background conditions is not fully taken into account.

SUMMARY OF THE INVENTION

[0006] The object of the present invention is to perform accurate object recognition in a space having a plurality of areas with different background conditions.  [0007] In accordance with one aspect of the present invention, an information processing apparatus has an image acquisition unit configured to acquire an input image and, for each pixel of the input image, position information in a space having a predetermined surface forming part of a background, and an extraction unit configured to extract from the acquired input image, in an area corresponding to a portion of the space having the surface in the background, an object area corresponding to an object located in the foreground relative to the surface, based on the position information of the object in the acquired input image, position information of the surface, and a first reference, and to extract from the acquired input image, in an area corresponding to a portion of the space not having the surface in the background, an object area corresponding to the object located in the foreground relative to the surface, based on the position information of the object in the acquired input image and a second reference different from the first reference.  Other features of the present invention will become apparent from the following description of exemplary embodiments with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS

[0009] FIG. 1A and FIG. 1B are block diagrams respectively illustrating examples of a hardware configuration and a functional configuration of an information processing apparatus.  FIG. 2 illustrates an example of an aspect of a system according to a first exemplary embodiment.  [0011] FIG. 3 is a flowchart illustrating an example of gesture recognition processing according to the first exemplary embodiment.  [0012] FIG. 4A, FIG. 4B and FIG. 4C are conceptual diagrams illustrating the detection of an object located at a position higher than a table top according to a known technique.  FIG. 5 is a flowchart illustrating an example of processing for extracting an object area in the area having the table.  FIG. 6A, FIG. 6B and FIG. 6C are conceptual diagrams illustrating the detection of an object at a position higher than a table top.  FIG. 7 is a flowchart illustrating an example of processing for extracting an object area in the area that does not include the table.  FIG. 8A and FIG. 8B are flowcharts illustrating examples of the processing for determining a second height threshold value to be used in the processing performed in the non-table area.  FIG. 9 is a conceptual diagram illustrating a range of parameters to be used in a process for correcting the second threshold value.  [0018] FIG. 10A and FIG. 10B are conceptual diagrams illustrating a process for correcting the second height threshold value.  FIG. 11A and FIG. 11B are conceptual diagrams illustrating a process for correcting the second height threshold value.  [0020] FIG. 12A and FIG. 12B are conceptual diagrams illustrating a process for correcting the second height threshold value.  FIG. 13 is a flowchart illustrating an example of gesture recognition processing according to a second exemplary embodiment.  FIG. 14 is a flowchart illustrating an example of processing for determining the second height threshold value.  FIG. 15 is a diagram illustrating an example relating to the flowchart of the process for determining the second height threshold value in the second exemplary embodiment.

DESCRIPTION OF THE EMBODIMENTS

[0024] Information processing according to the exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings.  The configurations described in the exemplary embodiments are given by way of example only, and the scope of the present invention is not limited to these configurations.  As a first exemplary embodiment of the present invention, an example of a table top interface (I/F) system enabling a user to perform a gesture operation on a graphical user interface (GUI) projected onto a table top is described.  In the following exemplary embodiments, a predetermined table contained in the background of the space in which a gesture is to be performed is referred to as the "table (table top)"; however, instead of the table (here, a table as a physical object), a vertically arranged screen or whiteboard can be used.  In the present description, the table (the table top), the screen, the whiteboard, etc. are referred to as the "table"; however, the table is not limited to a plate-like object having no irregular portion on its surface.  The table referred to in this specification is a physical surface having a finite area, and the surface may be considered to be in a stable, stationary state.
An example of the object (hereinafter referred to as the operation object) to be used by a user to perform a gesture is a hand of the user.  In addition, other body parts (for example, fingers and legs) or an object held by the user (for example, a stylus) may be used as the operation object.  FIG. 1A is a block diagram illustrating a hardware configuration of an information processing apparatus 100 according to the first exemplary embodiment.  A central processing unit (CPU) 101, using a random access memory (RAM) 103 as a working memory, executes an operating system (OS) and programs stored in a read-only memory (ROM) 102 and in a storage device 104, controls the components connected to a system bus 106, and performs calculations and logical determinations in various types of processing.  The processing to be performed by the CPU 101 includes the gesture recognition processing (information processing performed in a system for recognizing a gesture performed by a user as an operation input) illustrated in the flowcharts described hereinafter.  The OS and the processing programs can be stored in the storage device 104, in which case the necessary information is read into the RAM 103 at power-up, as needed.  The storage device 104 is, for example, a hard disk drive or an external storage device connected via various types of interfaces, such as a network or a universal serial bus (USB).  In this exemplary embodiment, the storage device 104 stores, for example, the digital data to be projected onto the table top, the coordinate acquisition parameters acquired by calibration, and a background image acquired in advance.  A telemetry image sensor 105, under the control of the CPU 101, captures, by a method described hereinafter, an image of a space having as part of its background the table top onto which a GUI is projected and displayed, and further comprising the peripheral area of the table (hereinafter referred to as the table peripheral space), in order to acquire a telemetry image.  The telemetry image sensor 105 inputs the acquired telemetry image to the information processing apparatus 100 via the system bus 106.  Each pixel of the captured telemetry image is representative of the distance between the sensor and the object.  The telemetry image sensor 105 is installed so that, in a case where an operation object is in the table peripheral space, the operation object is captured in the foreground (on the side closer to the sensor than the table) of the table forming part of the background.  In this exemplary embodiment, as a method for acquiring the telemetry image, a method having a low impact on the ambient light and on the display on the table top is used; depending on the envisaged use, a parallax method, a time-of-flight method or other methods may be used.  A display device 107 includes a display, a projector, or the like, for displaying an image such as a GUI and various types of information.  In this exemplary embodiment, a liquid crystal projector is used as the display device 107.  In this exemplary embodiment, the telemetry image sensor 105 and the display device 107 are external devices connected to the information processing apparatus 100 via an interface 108, for input and output, respectively.  These components can also be integrated into the information processing apparatus 100.  FIG. 1B is a block diagram illustrating an example of a functional configuration of the information processing apparatus 100.
These functional units are implemented by the CPU 101 by loading the programs stored in the ROM 102 into the RAM 103 and executing processing according to the flowcharts described hereinafter.  If hardware is used instead of software processing by the CPU 101, computing units and circuits corresponding to the processing performed in the functional units described below are configured.  An image acquisition unit 201 acquires a telemetry image input from the telemetry image sensor 105 as an input image.  The input image is an image of the table peripheral space captured by the telemetry image sensor 105, and each pixel value represents the distance information from the telemetry image sensor 105 to the object.  In this exemplary embodiment, the telemetry image sensor 105 is installed so as to capture an image of the space having the table as part of the background.  The distance information contained in the pixel values of the telemetry image therefore corresponds to position information in a direction intersecting the upper surface of the table.  A height acquisition unit 202 acquires, on the basis of the position of a pixel of the input image and its pixel value, three-dimensional position information of the object in the table peripheral space.  The three-dimensional position information that can be acquired by the height acquisition unit 202 includes at least the height information of the object above the table top, that is, on the side closer to the foreground than the table top.  A boundary acquisition unit 203 acquires, on the basis of the three-dimensional position information acquired from the input image, the position of the boundary between the area having the table top in the background and the area not having the table top in the background, in the table peripheral space.  In the description given below, the area having the table top in the background (the range corresponding to the space above the table) is referred to as the first area, and the area not having the table top in the background (the range corresponding to the space at the periphery of the table) is referred to as the second area.  A threshold determining unit 204 determines, for the second area, a height threshold value to be used for extracting from the input image an area comprising an object located at a position higher than the table.  In this exemplary embodiment, the second height threshold value to be used in the extraction processing for the second area is defined as a height in the world coordinate system, and is compared with the height obtained by converting the pixel value of the input image.  The threshold determining unit 204 corrects the second height threshold value in a case where the object area extracted from the first area is in contact with the boundary between the two areas.  An extraction unit 205 extracts, from each of the first area and the second area, an object area comprising an object located at a position higher than the table, on the basis of respectively different height references.  In this exemplary embodiment, in the first area, an object area is extracted by threshold value processing with the first height threshold value, using the relative height with respect to the table top as the reference height.
In the second area, an object area is extracted by threshold value processing with the second height threshold value, using the height (height in the world coordinate system) obtained by converting the pixel value of the input image.  A recognition unit 206 determines whether the entire object area, in which the object area extracted from the first area and the object area extracted from the second area are combined, is a predetermined operation object such as a hand of a user, and recognizes a gesture operation performed by the operation object.  In this exemplary embodiment, since a tactile contact of the operation object on the table top is recognized on the basis of the three-dimensional position information of the operation object without using a touch sensor, the touch operation is considered to be one type of gesture operation.  A display control unit 207 generates a display image to be displayed on the table top by the display device 107, and outputs the display image via the interface 108.  Specifically, in this exemplary embodiment, the display control unit 207 generates a display image representing a response to an operation recognized by the recognition unit 206, on the basis of data stored in the storage device 104 or in other devices, and outputs the image in order to clearly present feedback on the operation to the user.

[System Aspect]

[0039] FIG. 2 illustrates an example of an aspect of a system to which the information processing apparatus 100 of this exemplary embodiment can be applied.  A liquid crystal projector, which is the display device 107, is disposed above a table top 301.  For example, a GUI screen 302 is projected onto the table top 301.  The user uses an operation object such as a hand 303 to perform a touch operation of touching a GUI object contained in the GUI screen 302.  In the system illustrated in FIG. 2, the operation area, which is the range in which the system can recognize an operation such as a gesture or a touch, corresponds to the range above the table top 301 onto which projection by the display device 107 can be performed.  In the positional relationship between the display device 107 and the table top 301, the two are fixed so that the central portion of the table top 301 corresponds to the central portion of the screen projected by the display device 107, and image projection can be performed over 90% or more of the surface of the table top 301.  As long as projection onto the surface of the table top 301 can be performed, it is not always necessary to install the housing of the display device 107 above the table top 301.  The telemetry image sensor 105 is also disposed above the table top 301, and acquires a captured telemetry image of the table peripheral space having the table top 301 as part of the background.  The information processing apparatus 100 detects the hand 303 from the telemetry image and recognizes a user gesture made in the peripheral space of the table top 301.  In this exemplary embodiment, as illustrated in FIG. 2, x, y and z coordinate axes are defined.  In FIG. 2, the two-dimensional plane parallel to the upper surface of the table is the xy plane, and the direction orthogonal to the upper surface of the table and extending upward is the positive direction of the z axis.  In each pixel of the telemetry image captured by the telemetry image sensor 105, a pixel value corresponding to the distance from the sensor to the object is recorded.
In this exemplary embodiment, as illustrated in FIG. 2, the telemetry image sensor 105 is installed so as to include the table top 301 in its angle of view, and the object appearing in the telemetry image to be captured is between the telemetry image sensor 105 and the table top 301.  Therefore, the distance from the sensor to the object can be regarded, relatively, as information representative of the distance (height) from the surface of the table top 301.  However, in FIG. 2, since the telemetry image sensor 105 is installed at an angle with respect to the table top 301, the surface of the table top 301 is inclined with respect to the image sensing element of the telemetry image sensor 105, and the measured distances differ even for points lying in a plane at a constant height with respect to the surface of the table top 301.  Therefore, in this exemplary embodiment, a coordinate transformation is performed on the distance information indicated by each pixel value of the telemetry image captured by the telemetry image sensor 105, on the basis of the lens characteristics of the telemetry image sensor 105 and its relative positional relationship with the table top 301, in order to project the coordinates into the xyz space.  Owing to this transformation, xyz coordinates can be acquired for each point of the space having the table top 301 as part of the background.  In FIG. 2, the z axis, which provides the reference information of the height relative to the table top, is perpendicular to the table top, but a certain degree of inclination can be allowed as long as the z axis is oriented in a direction intersecting the table top.  Since distance information in the z axis direction can be acquired, the position of the housing of the telemetry image sensor 105 is not limited to a position above the table top 301.  In this exemplary embodiment, the table top is installed so as to be parallel to the ground, and the position information (z coordinate) in the direction perpendicular to the table top is referred to as the "height".  However, depending on the orientation of the table, the term "height" does not always refer to the vertical direction.  In this exemplary embodiment, since the positional relationship between the telemetry image sensor 105 and the table top 301 is defined in advance, a telemetry image in which only the table is captured is acquired in advance as a basic background image.  The information processing apparatus 100 stores the acquired telemetry image information as the basic background image in the storage device 104.  The acquisition of the basic background image is performed on a regular basis, for example immediately after start-up of the information processing apparatus 100, or at regular time intervals.  In this exemplary embodiment, the table top 301 may be a table top having an integrated display with the function of displaying the GUI screen 302.  In addition to the above, a system comprising a visible light camera for capturing a visible light image of an object placed on the table top 301 can be configured.  As illustrated in FIG. 2, in this exemplary embodiment, the user reaches a hand 303 over the table from outside the peripheral space of the table top 301 in order to perform an operation.  Therefore, for example when the user touches a GUI object displayed at a position near the edge of the table on the GUI screen, part of the hand 303 is likely to be outside the edge of the table top 301.  In this case, in the telemetry image captured by the telemetry image sensor 105, the hand extends beyond the edge of the table.
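As an illustration only, and not the calibration procedure of the apparatus itself, the following sketch shows one conventional way in which the per-pixel coordinate transformation described above could be carried out, assuming a simple pinhole model with hypothetical intrinsic parameters (fx, fy, cx, cy) and a calibrated sensor-to-table transform (R, t); all names are illustrative.

```python
import numpy as np

def range_image_to_xyz(depth, fx, fy, cx, cy, R, t):
    """Project a telemetry (range) image into table-space xyz coordinates.

    depth      : HxW array of distances from the sensor to the object.
    fx, fy,
    cx, cy     : assumed pinhole intrinsics of the range sensor.
    R, t       : assumed rotation (3x3) and translation (3,) from the sensor
                 frame to a frame whose z axis intersects the table top.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project each pixel into the sensor frame.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    pts_sensor = np.stack([x, y, depth], axis=-1)      # HxWx3
    # Apply the calibrated sensor-to-table transform; the resulting z
    # coordinate is the "height" used in the rest of the description.
    return pts_sensor @ R.T + t                        # HxWx3
```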
Therefore, in this exemplary embodiment, in the area having the table in the background and in the area not having the table in the background within the table peripheral space, different height references are used to extract the area comprising an object above the table.  The extracted areas are then combined to detect the entire image of an object extending over the boundary between the two areas.  FIG. 3 is a flowchart illustrating an example of the gesture recognition processing performed by the information processing apparatus 100 in this exemplary embodiment.  This processing is carried out by the CPU 101, which constitutes the functional units of the information processing apparatus, by loading a program stored in the ROM 102 into the RAM 103 and executing the program.  In this exemplary embodiment, in response to the execution of an application for recognizing a gesture operation, the telemetry image sensor 105 begins capturing telemetry images, and a telemetry image is input to the information processing apparatus 100 at each predetermined period.  In response to the input of the captured telemetry image, the processing shown in FIG. 3 begins.  In step S401, the threshold determining unit 204 determines a threshold value to be used in the process (step S405) of detecting an object area in the area (second area) not having the table.  Referring to the flowchart shown in FIG. 8A, the processing for determining the threshold value for the second area in step S401 is described in detail.  In step S901, the boundary acquisition unit 203 acquires the pixel value information of the basic background image captured in advance.  In step S902, the boundary acquisition unit 203 detects the pixels for which the difference between their pixel value and that of an adjacent pixel is greater than a predetermined value.  When the difference between pixel values is large, it means that there is an extremely large height difference, which here corresponds to an edge portion of the table.  Therefore, a series of continuous pixels, each of which has a difference between its pixel value and that of the adjacent pixel greater than the predetermined value, is used as the boundary between the area having the table in the background and the area not having the table.  The pixels used as the boundary are the pixels contained in the range above the table (pixels with high pixel values).  In step S903, the threshold determining unit 204 acquires a value indicative of the average height by converting the pixel values of the pixels forming the boundary between the areas, on the basis of the acquired height information.  In step S904, the threshold determining unit 204 determines the acquired average height as the height threshold value for the second area.  In this exemplary embodiment, the determined threshold value is stored in the RAM 103, and the stored value is used when the threshold value processing is executed.  Returning to the flowchart shown in FIG. 3, in step S402, the image acquisition unit 201 acquires a telemetry image input by the telemetry image sensor 105 as an input image.  In step S403, the extraction unit 205 extracts from the input image, in the area (first area) having the table in the background, an area comprising an object located at a position higher than the table, as an object area.  When an object is at a position higher than the table, it means that the object is in the foreground of the table constituting the background.
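As an illustration of the threshold determination of steps S901 to S904 just described, the following is a minimal sketch under simplifying assumptions: the basic background image is assumed to have already been converted to a height map (a 2D array), and the way the table side of each edge pair is selected is a simple heuristic, not taken from the source.

```python
import numpy as np

def determine_second_height_threshold(bg_height, diff_threshold):
    """Sketch of steps S901-S904: find the table edge in the basic background
    image and use the mean boundary height as the second height threshold."""
    # S902: adjacent-pixel height differences larger than diff_threshold
    # are treated as belonging to the table edge.
    dx = np.abs(np.diff(bg_height, axis=1))
    dy = np.abs(np.diff(bg_height, axis=0))
    # For each edge pair, keep the higher of the two heights: these are the
    # boundary pixels lying on the table side (the "high pixel value" side).
    edge_heights = np.concatenate([
        np.maximum(bg_height[:, :-1], bg_height[:, 1:])[dx > diff_threshold],
        np.maximum(bg_height[:-1, :], bg_height[1:, :])[dy > diff_threshold],
    ])
    # S903-S904: the mean boundary height becomes the second height threshold.
    return float(edge_heights.mean())
```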
In this exemplary embodiment, in the area having the table in the background, the object area is detected by processing using the relative height with respect to the table surface as the reference height.  Specifically, an area in which the relative height with respect to the table surface is greater than a predetermined threshold value (the first height threshold value) is extracted as the object area.  The height of the table and the height of the object are detected by the same system, so that, by using their relative values, detection can be performed with reduced effects from the coordinate transformation error and from the detection errors of the system.  The processing of step S403 is described in detail below.  In step S404, the boundary acquisition unit 203 acquires the coordinates of the portion of the boundary between the first and second areas contained in the object area extracted in step S403.  In this exemplary embodiment, the user reaches a hand over the table to perform an operation.  Therefore, when an object area is detected above the table, the object is likely to lie over the edge of the table.  Accordingly, if the object area extracted from the first area is an area in which an operation object is captured, a portion belonging to the boundary detected in step S902 is contained in the object area.  In other words, the boundary portion detected in step S902 includes at least a part of the object area extracted from the first area.  On the other hand, if no part in contact with the boundary between the first and second areas is contained in the extracted object area, it is assumed that the object is not an operation object such as the user's hand, but some object placed on the table.  In this case, it is not necessary to track the extracted object area and recognize a gesture operation.  For example, the processing according to the flowchart then ends, or the process returns to step S402.  The boundary acquisition unit 203 stores the acquired information on the boundary portion in the RAM 103.  In step S404, an area of predetermined size, or a straight line (a tangent to the boundary) of predetermined length, including the part that belongs to the boundary between the first and second areas within the object area extracted from the first area in step S403, can be acquired.  In step S405, the extraction unit 205 extracts from the input image, in the area (second area) not having the table, an area comprising an object located at a position higher than the table, as an object area.  In this exemplary embodiment, in the area not having the table in the background, the object area is detected by processing using the height indicated by the pixel values of the input image as the reference height.  Specifically, the extraction is performed by threshold value processing using the height threshold value (second height threshold value) determined in step S401.  The second height threshold value corresponds to the height of a virtual table surface obtained by assuming that the table top surface extends into the non-table area.  Specifically, an area in which the heights indicated by the pixel values of the input image are higher than the virtual table surface indicated by the second height threshold value is extracted as an object area.
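The following is a rough sketch contrasting the two extraction references described so far (step S403, and step S405 before the correction discussed next); it assumes that the input image and the basic background image have been converted to height maps and that a boolean mask of the first area is available. The variable names are illustrative, not taken from the source.

```python
import numpy as np

def extract_object_area(height, bg_height, first_area_mask,
                        first_threshold, second_threshold):
    """Sketch of steps S403 and S405 (without threshold correction).

    height          : HxW heights converted from the input telemetry image.
    bg_height       : HxW heights of the basic background image (the table).
    first_area_mask : HxW boolean mask, True inside the area having the
                      table in the background (first area).
    """
    # S403: first area - relative height above the table surface.
    first = first_area_mask & ((height - bg_height) >= first_threshold)
    # S405 (uncorrected): second area - height in the world coordinate
    # system compared with the virtual table surface.
    second = (~first_area_mask) & (height >= second_threshold)
    return first | second    # combined object area
```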
However, in this exemplary embodiment, in order to take into account the detection errors in the table edge position information and in the table height information, the threshold value is corrected, and the threshold value processing is then performed as a function of the distance, on the xy plane, from the boundary between the first and second areas.  The processing of step S405 is described in detail below.  In step S406, the recognition unit 206 determines whether the object captured in the area is a hand of the user, with respect to the entire combined object area formed from the object areas extracted in step S403 and step S405.  If the object used as the operation object is not the user's hand but a predetermined object such as a stylus, the recognition unit 206 determines whether the combined object area is that operation object.  In this exemplary embodiment, when the shape (size, degree of circularity, aspect ratio and degree of irregularity) and the color information of the combined object area meet predetermined conditions, it is determined that the area is a user's hand.  These conditions include, for example, a high matching probability between the shape of the object area and a shape model provided in advance, and whether the object area fits within a rectangle having a predetermined dimension ratio.  In addition, by using a visible light image obtained by capturing the same range as the input image, it can be determined whether the average of the red-green-blue (RGB) values of the pixels in the object area lies within a range defined as "the color of human skin".  If it is determined that the object is a user's hand (YES in step S406), processing proceeds to step S407.  If it is determined that the object is not a user's hand (NO in step S406), processing proceeds to step S410.  In step S407, the recognition unit 206 determines whether the user's hand is in a pose pointing in one direction, on the basis of the shape of the combined area.  In this exemplary embodiment, matching with a model, provided in advance, of a hand in a pose pointing in one direction is performed, and the determination is made on the basis of probability.  If it is determined that the hand is in a pose pointing in one direction (YES in step S407), processing proceeds to step S408.  If it is determined that the hand is not in a pose pointing in one direction (NO in step S407), processing proceeds to step S409.  In step S408, the recognition unit 206 recognizes a touch operation performed by the hand in the pose pointing in one direction.  Specifically, the recognition unit 206 acquires the height information of the position corresponding to the fingertip in the object area, and determines whether the upper surface of the table is touched by the fingertip on the basis of the difference between the height of the fingertip and the height of the table.  When the upper surface is touched, processing corresponding to the object displayed at the touched position is executed.  For example, the object displayed at the touched position is placed in a selected state, or a command associated with the object is executed.  The display control unit 207 generates a display image indicating a response to the recognized operation, and outputs the image to the display device 107.
If the upper surface of the table is not touched within a predetermined waiting time, the recognition of a touch operation ceases and the processing proceeds to the next step.  In step S409, the recognition unit 206 stores the position information of the object area in the RAM 103, and recognizes the gesture operation performed by the user's hand on the basis of the history over a predetermined period of time.  In this exemplary embodiment, the recognition unit 206 can recognize at least a gesture operation (drag operation) in which the user's hand moves in a certain direction within the operation area.  In order to recognize the drag operation, in step S409, the recognition unit 206 specifies the position information of the center of gravity of the object area.  The recognition unit 206 then stores this information in the RAM 103 or in another device, performs matching of the history of center-of-gravity positions up to the current time with a Hidden Markov Model (HMM) using the amount of displacement as a feature quantity, and performs recognition on the basis of probability.  The display control unit 207 generates a display image indicating a response to the recognized drag operation, and outputs the image to the display device 107.  [0055] The gesture recognition of a touch operation and a drag operation in the processing from step S406 to step S409 in this exemplary embodiment is not limited to this sequence.
[0003] For example, regardless of the result of the determination of whether the object is a hand (step S406) and of the determination of whether the hand is in a pose pointing in one direction (step S407), the recognition of a touch operation (step S408) and the recognition of a drag operation (step S409) can be performed. In addition, using information indicating the height of the hand or its speed of movement, other gesture operations can be recognized. If the processing performed in step S409 ends, or if it is determined in step S406 that the object captured in the object area is not a hand of the user, then in step S410 the CPU 101 determines whether the input of telemetry images from the telemetry image sensor 105 has ended. If the image input has ended (YES in step S410), the gesture recognition processing ends. If the image input continues, the process returns to step S402, and the gesture recognition processing is performed on a new input image.

[Processing for extracting an object area in the first area]

[0057] The processing to be performed in step S403 for extracting from the input image, in the first area containing the table, an area comprising an object at a position higher than the table as an object area, is now described in detail. Referring to FIGS. 4A to 4C, the detection of a user's hand stretched over the edge of the table is described, together with the problems that may occur when the processing is performed in accordance with a known technique. FIGS. 4A to 4C illustrate the periphery of the table top 301 seen in a cross section parallel to the yz plane of the system shown in FIG. 2. In FIGS. 4A to 4C, a hand 501 is a hand of a user pointing at the table top 301 in a pose pointing in one direction (a state in which only the index finger of the hand is extended). In the known art, if the fact that the user's hand is stretched over the edge of the table is not taken into account, a part whose height information indicates a height greater than that of the upper surface of the table top 301 can be detected from the input image as a hand area, regardless of whether it is inside or outside the edge of the table. FIG. 4A illustrates an ideal state. A height 502 is the height of the table top measured by the telemetry image sensor 105. The entire hand 501 of the user is at a position higher than the height 502, so that the hand can be accurately detected. However, in an actual detection, because of the errors affecting the distance measurement depending on the performance of the telemetry image sensor 105 and the errors affecting the coordinate transformation parameters acquired by calibration, the results of converting the information acquired from a telemetry image into coordinate information contain errors. As a result, it is difficult to detect the height of the table top accurately and to separate an upper portion from a lower portion with the table top serving as the boundary. For example, FIG. 4B illustrates a case where the detected height of the table top is higher than the true height of the table top. A height 504 is the height of the table top measured by the telemetry image sensor 105. In this case, it is determined that a portion of the user's hand 501 is at a position lower than the height 504, and a loss 505 is produced in the detected area. FIG. 4C illustrates a case where the detected height of the table top is lower than the true height of the table top.
A height 506 is the height of the table top measured by the telemetry image sensor 105. In this case, a portion 507 of the table is at a position higher than the height 506 and is extracted as an object area, so that it is difficult to distinguish it from the real object area. To overcome these problems, in this exemplary embodiment, the pixels of the input image are separated into the first area having the table in the background and the second area not having the table. In the first area, an object area above the table is detected on the basis of threshold value processing applied to the relative height with respect to the height of the table surface. Since the errors affecting the distance measurement or the coordinate transformation are contained both in the object area and in the table top, their effects can be reduced by using a relative height. Referring to the flowchart shown in FIG. 5, a specific sequence of the processing to be performed in step S403 is described. In step S601, the height acquisition unit 202 selects a target pixel contained in the first area of the input image. In this exemplary embodiment, since the position of the edge of the table can be acquired using the basic background image acquired in advance, the target pixel is selected by using, as the first area, the inside of the range delimited by the edge considered to be the boundary. In this exemplary embodiment, target pixels are selected one by one. Alternatively, blocks of a plurality of pixels can be selected one by one and the average of the pixel values can be used as the height information of the block, so that the subsequent processing can be performed. In step S602, the height acquisition unit 202 acquires the height of the table top at the position (the position indicated by the xy coordinates) corresponding to the target pixel. In this exemplary embodiment, the height acquisition unit 202 acquires the information indicating the height of the table top at the position corresponding to the target pixel from the information of the telemetry image captured in advance as the basic background image. The height of the table can also be acquired by methods other than using the telemetry image captured in advance as the basic background image. For example, on the basis of a difference from the input image of a previous frame, it is determined whether there is movement on the table. If no movement occurs, it is determined that no object is on the table, and the table height can be acquired again from the input image. This method is effective in a case where the latest state of the table differs from its state when the basic background image was captured. In step S603, the height acquisition unit 202 acquires the relative height, with respect to the table top, corresponding to the height indicated by the pixel value of the target pixel. In this exemplary embodiment, the relative height is obtained by subtracting the value of the target pixel from the acquired value of the table height. In step S604, the extraction unit 205 determines whether the acquired relative height is greater than or equal to the first height threshold value. The first height threshold value is set in advance as a value indicating a distance appropriate for distinguishing the table from a user's hand. If the acquired relative height is greater than or equal to the first height threshold value (YES in step S604), processing proceeds to step S605.
If the acquired relative height is not greater than or equal to the first height threshold value (NO in step S604), processing proceeds to step S606. In step S605, the target pixel is specified as part of the object area. In step S606, the height acquisition unit 202 determines whether all the pixels in the first area have been processed. If all the pixels in the first area have been processed (YES in step S606), the process returns to the main sequence, and the processing of step S404 and the subsequent steps is performed. If all the pixels in the first area have not been processed (NO in step S606), the process returns to step S601, a pixel that has not yet been processed is selected, and the processing is repeated. In the flowchart shown in FIG. 5, the target pixel is selected from the pixels of the first area defined in advance, and the processing is performed on all the pixels in the first area. Alternatively, a process of scanning all the pixels of the input image and determining whether each pixel is contained in the first area may be performed. In this exemplary embodiment, the first height threshold value to be used in the extraction processing for the first area is a threshold value to be compared with the relative height with respect to the table top, and it is assumed that a predetermined value is used. The threshold determining unit 204 may also perform a correction processing on the first height threshold value, on the basis of the input image, in order to obtain a threshold value appropriate for the environment. The processing to be performed in step S405 for extracting, as an object area, an area comprising an object at a position higher than the table in the second area, in which the table is not contained in the telemetry image, is now described in detail. Referring to FIGS. 6A to 6C, examples of situations that may occur when a user's hand above the edge of the table is detected separately in two areas, one having the table and the other not having the table, are described. FIGS. 6A to 6C illustrate the periphery of the table top 301 seen in a cross section parallel to the yz plane of the system shown in FIG. 2. As in FIGS. 4A to 4C, the hand 501 is a hand of the user pointing at the table top 301 in a pose pointing in one direction (a state in which only the index finger of the hand is extended). As described above, in the first area above the table, the noise and loss occurring upon detection of an object area can be reduced by using threshold value processing of the relative height with respect to the table. FIG. 6A illustrates a state in which the boundary between the first area and the second area is accurately detected. A boundary 701 corresponds to the edge of the table detected in step S902 on the basis of the basic background image, and to the boundary between the first area and the second area. A height 702 is the threshold value (second height threshold value) of the height in the second area, determined in step S401. When the boundary between the first area and the second area is detected on the basis of the basic background image, errors may sometimes occur. For example, because of a detection error, the table may in fact exist at a position corresponding to a pixel considered to be outside the edge of the table. FIG. 6B illustrates a case in which the detected height of the table top is lower than the true height of the table, in a state in which the edge of the table is detected at a position closer to the inside of the
table than the real position. In this case, since a portion of the table located in the second area is at a position higher than the height 702, that portion is detected as an object area. It therefore constitutes a noise 801. On the yz plane, the noise portion is at a position immediately below the real hand, and the y coordinates detected as the object area appear to be the same as in the ideal state of FIG. 6A. However, the noise 801 has the same dimension as the width of the table top 301 in the x axis direction, so that the shape of the object area including the noise 801 is not recognized as the hand of a user. This causes a decrease in the recognition accuracy of the operations. Conversely, FIG. 6C illustrates a case in which the detected height of the table top is higher than the true height of the table, in a state in which the edge of the table is detected at a position closer to the inside of the table than the real position. In this case, the part of the hand 501 lower than the height 702 is not detected as an object area. A loss 802 is therefore produced.
[0004] Depending on the size and shape of the loss 802, the object area detected in the first area and the object area detected in the second area become discontinuous. This can cause a decrease in the recognition accuracy of an operation performed by the hand. In this exemplary embodiment, on the basis of the information on the boundary portion between the areas contained in the object area extracted by the processing performed on the first area, the magnitude of the second height threshold value is changed, or its range of application is limited, in order to reduce the occurrence of noise and loss in the object area. [0069] Referring to the flowchart shown in FIG. 7, the sequence of the processing for detecting an object area in the second area, to be performed in step S405, is described in detail. In step S801, the height acquisition unit 202 selects a target pixel contained in the second area of the input image. In this exemplary embodiment, target pixels are selected one by one. Alternatively, blocks of a plurality of pixels may be selected one by one and the average of the pixel values may be used as the height information of the block, so that the subsequent processing can be performed. In step S802, the threshold determining unit 204 acquires the position (xy coordinates) corresponding to the target pixel, and the distance to the portion of the boundary between the first and second areas that contains the object area extracted in the first area, this portion having been detected in step S404. In this exemplary embodiment, the shortest distance is calculated on the basis of the y coordinate. In step S803, the threshold determining unit 204 corrects the height threshold value (second height threshold value) determined in step S401 according to the distance acquired in step S802. This processing is performed to prevent the first area and the second area from becoming discontinuous, and to prevent the table from being detected at positions where the user's hand cannot be located, which would cause noise. In an area in which the distance acquired in step S802 is small, that is, in an area corresponding to the periphery of the boundary, the height threshold value is reduced in order to prevent the area from becoming discontinuous. On the other hand, in an area in which the distance acquired in step S802 is large, that is, in an area far from the user's hand, the height threshold value is increased in order to prevent the table from being contained in the object area as noise. The processing performed in step S803 is described in detail below. In step S804, the height acquisition unit 202 acquires the height indicated by the pixel value of the target pixel. In step S805, the extraction unit 205 determines whether the acquired height is greater than or equal to the second height threshold value. The second threshold value is the value corrected in step S803. If the height is greater than or equal to the second height threshold value (YES in step S805), processing proceeds to step S806. If the acquired height is not greater than or equal to the second height threshold value (NO in step S805), processing proceeds to step S807. In step S806, the extraction unit 205 specifies the target pixel as part of the object area. In step S807, it is determined whether all the pixels in the second area have been processed.
If all the pixels in the second area have been processed (YES in step S807), the process returns to the main sequence, and the processing of step S406 and the subsequent steps is performed. If all the pixels in the second area have not been processed (NO in step S807), the processing returns to step S801, a pixel that has not yet been processed is selected as the target pixel, and the processing is repeated. In the flowchart shown in FIG. 7, the target pixel is selected from the pixels of the second area defined in advance, and the processing is performed on all the pixels in the second area. Alternatively, a process of scanning all the pixels of the input image and determining whether each pixel is contained in the second area may be performed. In this exemplary embodiment, the processing is performed in a space in which the table is installed so as to be parallel to the ground constituting the background. Therefore, in the second area, a pixel whose pixel value indicates a height higher than the second height threshold value is specified as belonging to the object area. However, in a case where the table top contained in the background is not parallel to the ground, a pixel whose position information (position information in a direction intersecting the table top), indicated by its pixel value, indicates a position on the side closer to the foreground (the side closer to the telemetry image sensor 105) than the surface of the table top is specified as belonging to the object area. Referring to the flowchart shown in FIG. 8B and the conceptual diagrams of FIGS. 9 to 12B, the processing for correcting the second height threshold value, to be performed in step S803, is described. The flowchart shown in FIG. 8B illustrates the detailed processing performed in step S803. In step S1001, the CPU 101 determines whether the distance acquired in step S802 is less than a threshold value L. If the distance is less than L (YES in step S1001), in step S1002 a value H * (distance / L), proportional to the distance, is added to the second height threshold value. On the other hand, if the distance is greater than or equal to L (NO in step S1001), in step S1003 the value H is added to the second height threshold value. As a result, the second height threshold value at the periphery of the boundary between the two areas becomes smaller than the second height threshold value in the range far from that periphery. Referring to FIG. 9, the threshold value L and the value H are described. The threshold value L is set so as not to exceed the horizontal distance 1101 from the fingertip to the wrist when the table is touched with the fingertip, while the value H is set so as not to exceed the height 1102 of the wrist at the time of the touch. If the threshold value L and the value H are set so that they exceed these limits, the second height threshold value exceeds the heights of the finger and of the wrist at the time of the touch, which can cause a loss in the extracted area. In this exemplary embodiment, as an example respecting the limits described above, the values are set as follows: threshold value L = 80 mm and value H = 30 mm. Referring to FIGS. 10A to 12B, a specific example of the correction of the second threshold value is described. FIGS. 10A to 12B illustrate, as in FIGS.
6B and 6C, states in which the edge of the table is detected at a position closer to the inside of the table than the real position. FIGS. 10A, 11A and 12A illustrate cross-sectional views in the yz plane. FIGS. 10B, 11B and 12B illustrate images of the table top 301 seen from above along the z axis. These images are captured by the telemetry image sensor 105 and correspond to input images to be supplied to the information processing apparatus 100. In this exemplary embodiment, an actual input image is a telemetry image. The drawings show that the inside of the boundary 701 is the first area, that the outside is the second area, and that the first area is smaller than the actual surface of the table top 301. In this exemplary embodiment, when a telemetry image corresponding to the image 1201 shown in FIG. 10B is input, the portion of the user's finger located inside the boundary 701 is extracted as an object area in the first area, and in step S404 the portion of the boundary 701 containing the extracted area is acquired. In this example, it is assumed that a tangent 1202 is acquired. FIGS. 11A and 11B illustrate that, in a range 1203 at the periphery of the boundary, the second height threshold value is corrected to a smaller value according to the distance from the tangent 1202. A broken line 1204 indicates the corrected second threshold value. FIG. 12B illustrates, in black, the portion extracted as an object area in the second area outside the boundary 701 of the image 1201. At the periphery of the boundary, since the height threshold value is corrected to a low value, no loss occurs in the object area. Moreover, even if a part of the table lies above the threshold value indicated by the broken line 1204, the range detected as the object area is limited to an area very close to the area extracted as the object area in the first area. As a result, the shape of the combined object area is not greatly distorted, and the occurrence of noise is reduced. The processing for correcting the height threshold value in step S803 may be omitted in a state such as that illustrated in FIG. 6A, in which the boundary between the first area and the second area is accurately acquired. For example, when the user directly supplies position information indicating the boundary as an input, or when the edge portion of the table is accurately detected by edge detection processing performed on a visible light image, the probability that the table top forms part of the second area is small. In the case where the correction can be omitted, a method using the relative height with respect to the table top as the reference height can be used in the first area, and a method using a reference height based on the detection of the table edge can be used in the second area. This allows accurate detection of a user's hand above the edge of the table. In the first exemplary embodiment, the second height threshold value is corrected according to the distance from the portion of the boundary corresponding to the table edge that contains the object area extracted in the area comprising the table. Alternatively, the processing performed in step S404 may be omitted, and in step S802 the shortest distance between the boundary line of the areas and the target pixel may be acquired.
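A minimal sketch of the threshold correction of steps S1001 to S1003, using the values L = 80 mm and H = 30 mm given above; the function signature is hypothetical and the heights are assumed to be expressed in millimetres.

```python
def corrected_second_threshold(base_threshold, distance, L=80.0, H=30.0):
    """Correct the second height threshold for one target pixel (sketch).

    distance : shortest distance, on the xy plane, from the target pixel to
               the boundary portion containing the first-area object area.
    L, H     : the limits discussed above (80 mm and 30 mm in this
               exemplary embodiment).
    """
    if distance < L:
        # Near the boundary: add only a distance-proportional amount, so that
        # the object area does not become discontinuous across the boundary.
        return base_threshold + H * (distance / L)
    # Away from the boundary: add the full value H, so that the table surface
    # is not picked up as noise.
    return base_threshold + H
```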
The height threshold value resulting from the correction processing of the second height threshold value according to the first exemplary embodiment is set, in the second area, to a value higher than the real table top, so that it can become difficult to detect an object located above the table top. In a case where an object that does not extend over the boundary between the first and second areas is to be detected, the processing can be performed separately using the second height threshold value that has not yet been corrected. Therefore, in order to detect an object lying outside the table, this modification may be used instead of, or in conjunction with, the first exemplary embodiment in order to increase the extraction performance. In the first exemplary embodiment, in step S406, whether the extracted area is a user's hand is recognized on the basis of the shape (size, degree of circularity, aspect ratio and degree of irregularity) and the color information of the extracted object area, in order to reduce the possibility that an operation is incorrectly recognized because of an object other than the operation object. Alternatively, the object can be recognized on the basis of the amount by which it enters the table area. For example, it is assumed that the area of the object area extracted in the first area is the entry amount, and that the whole of the object area extracted in the first and second areas is the size. A threshold value determination is then performed to determine whether the entry amount and the size lie within appropriate ranges, respectively. This makes it possible to eliminate an object area having an unnatural appearance. In addition, the first exemplary embodiment can be applied, instead of to the table top interface, to a system for extracting an area comprising an object located in the foreground of a table top constituting the background, from a telemetry image obtained by capturing a space comprising the table top. A second exemplary embodiment, which has, in addition to the configuration of the first exemplary embodiment, a configuration for reducing the gap generated as a result of the threshold value processing performed in each of the first area and the second area, will now be described. Since the system aspect and the hardware configuration of the second exemplary embodiment are similar to those of the first exemplary embodiment described with reference to FIG. 1 and FIG. 2, the same reference numerals are used and the detailed descriptions are omitted. However, in the functional configuration according to the second exemplary embodiment, the threshold determining unit 204 has a plurality of candidate values for the second height threshold value, and selects a threshold value having a small deviation from the extraction result of the object area in the first area. Referring to the flowchart shown in FIG. 13, the gesture recognition processing of the second exemplary embodiment is described. This processing is carried out by the CPU 101, which constitutes the functional units of the information processing apparatus, by loading a program stored in the ROM 102 into the RAM 103 and executing the program. This processing begins in response to the input of a telemetry image captured by the telemetry image sensor 105 into the image acquisition unit 201.
[0005] Figure 15 illustrates examples of how the extracted object areas change as the processing at each step of the flowchart shown in Figure 13 is performed. In Fig. 13, the steps having the same reference numbers as those of the flowchart shown in Fig. 3 perform processing similar to that of Fig. 3, so detailed descriptions are omitted and only the differences from Fig. 3 are described. In the second exemplary embodiment, in step S403, an object area in the first zone is extracted, and the processing proceeds to step S1301. In step S1301, the threshold determining unit 204 determines whether a second height threshold value has already been determined. Specifically, the threshold determining unit 204 makes this determination based on whether information on a determined second height threshold value is stored in the RAM 103. If a second height threshold value has not yet been determined (NO in step S1301), a second height threshold value is determined in step S1302.
Referring to the flowchart shown in Fig. 14 and to Fig. 15, the sequence of the second height threshold value determination processing (S1302) performed by the threshold determining unit 204, which is a functional unit of the CPU 101, is described. In step S1401, the threshold determining unit 204 determines N candidate values In (n = 0, ..., N-1) of the second height threshold value. The candidate values In are set using an expected table top height as a reference, within a distance range having upper and lower limits. For example, by performing processing similar to that of step S401, a height to serve as a reference is calculated, and a plurality of values lying within a predetermined distance range around it, including upper and lower limits, are determined. Alternatively, candidate values determined in advance on the basis of the fixed positional relationship between the telemetry image sensor 105 and the table top 301 can be read from the storage device 104 or the like and used.
For each candidate value In, the processing from step S1402 to step S1404 is then performed repeatedly. In step S1402, the pixels inside the table in the image are scanned, and a region of height In or higher is extracted. In the example illustrated in Fig. 15, a zone 1502 is extracted. In step S1403, the difference from the object area in the first zone extracted in step S404 is acquired. In this exemplary embodiment, an exclusive OR operation is performed. In the case of Fig. 15, a difference 1503 between an object area 1501 extracted in the first zone and the zone 1502 extracted in step S1402 is acquired. In step S1404, the CPU 101 acquires the number of difference pixels. The processing described above is carried out for each In (n = 0, ..., N-1). In step S1405, the threshold determining unit 204 determines the value In having the smallest number of difference pixels as the second height threshold value, and stores this information in the RAM 103. In Figure 15, the candidate threshold height decreases as the processing from step S1402 to step S1404 is repeated, and at the stage where I2 is used, the number of acquired difference pixels becomes the lowest. According to the second threshold value determination processing described above, a threshold value giving a small difference in extraction reference between the inside and the outside of the table can be acquired.
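The candidate selection of steps S1401 to S1405 can be sketched as follows; the function and variable names are hypothetical, and the sketch assumes the telemetry image has already been converted into a per-pixel height map.

```python
import numpy as np

def select_second_threshold(height_map, first_zone_mask, object_area_first,
                            candidates):
    """Candidate selection sketch for steps S1401-S1405.

    height_map        : per-pixel height values derived from the telemetry image
    first_zone_mask   : boolean mask of pixels inside the table (first zone)
    object_area_first : boolean mask of the object area already extracted in
                        the first zone with the relative-height reference
    candidates        : iterable of candidate absolute-height thresholds In

    Returns the candidate whose extraction inside the table differs the
    least (by exclusive OR) from the first-zone extraction result.
    """
    best_value, best_diff = None, None
    for candidate in candidates:
        # Step S1402: extract pixels inside the table with height >= candidate.
        extracted = first_zone_mask & (height_map >= candidate)
        # Steps S1403-S1404: exclusive OR with the first-zone object area
        # and count the differing pixels.
        diff = int(np.count_nonzero(np.logical_xor(extracted, object_area_first)))
        if best_diff is None or diff < best_diff:
            best_value, best_diff = candidate, diff
    # Step S1405: the candidate with the fewest difference pixels becomes
    # the second height threshold value.
    return best_value
```

Minimizing the exclusive OR difference aligns the absolute-height extraction with the relative-height extraction already obtained in the first zone, which is intended to make the selected candidate behave consistently when it is then applied in the second zone.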
[0006] In response to the determination of the second height threshold value, the processing returns to the flowchart shown in Fig. 13, and the processing of step S406 and the subsequent steps is performed. In the second exemplary embodiment, after the processing for recognizing a touch operation has been performed in step S408, the processing proceeds to step S1303, and the threshold determining unit 204 determines whether a touch operation has been performed. If a touch operation has been performed (YES in step S1303), the processing proceeds to step S1304. If, within a predetermined waiting time during the processing of step S408, no touch operation is performed on the table top with the user's fingertip, the threshold determining unit 204 determines that no touch operation has been performed (NO in step S1303), and the processing proceeds to step S409.
In step S1304, the threshold determining unit 204 determines the second height threshold value again. The second threshold value determination processing performed in step S1304, like that in step S1302, follows the flowchart shown in Fig. 14. When the user performs a touch operation, the user's hand is at a height at which it is in contact with the table top. Therefore, when a touch operation is performed, the processing for determining the height threshold value corresponding to the height of the table is performed again, which makes it possible to calculate a height threshold value that distinguishes the table from the hand more accurately. It is not always necessary to re-determine the threshold value each time a touch operation is performed, and the number of times the processing is executed may be limited. For example, the threshold value may be re-determined only when the first touch operation is performed after the information processing apparatus 100 is started. Limiting the number of executions in this way allows the processing capacity of the CPU 101 to be used preferentially for responding to touch operations.
The second exemplary embodiment can likewise be applied, beyond the tabletop interface, to a system for extracting an area of an object in the foreground of a table top forming the background, from a telemetry image obtained by capturing a space containing the table top. According to the exemplary embodiments of the present invention, accurate object recognition can be performed in a space having a plurality of areas with different background conditions.
Embodiments of the present invention may also be implemented by a computer of a system or apparatus that reads and executes computer-executable instructions stored on a storage medium (for example, a non-volatile computer-readable storage medium) in order to perform the functions of one or more of the above-described embodiment(s) of the present invention, and by a method performed by the computer of the system or apparatus by, for example, reading and executing the computer-executable instructions from the storage medium in order to perform the functions of one or more of the above-described embodiment(s). The computer may comprise one or more of a central processing unit (CPU), a micro processing unit (MPU), or other circuitry, and may include a network of separate computers or separate computer processors. The computer-executable instructions may be provided to the computer, for example, from a network or the storage medium.
The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read-only memory (ROM), a storage unit of a distributed computing system, an optical disc (such as a compact disc (CD), a digital versatile disc (DVD), or a Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like. While the present invention has been described with reference to exemplary embodiments, it should be noted that the invention is not limited to the exemplary embodiments disclosed.
Claims (16)
[0001]
1. An information processing apparatus comprising: an image acquisition unit configured to acquire an input image and, for each pixel of the input image, position information in a space having a predetermined surface forming part of a background; and an extraction unit configured to extract, from the acquired input image, in an area corresponding to a portion of the space having the surface in the background, an object area corresponding to an object closer to the foreground than the surface, based on position information of the object in the acquired input image, position information of the surface, and a first reference, and to extract, from the acquired input image, in an area corresponding to a portion of the space not having the surface in the background, an object area corresponding to the object closer to the foreground than the surface, based on the position information of the object in the acquired input image and a second reference different from the first reference.
[0002]
2. An information processing apparatus according to claim 1, wherein the first reference is a reference relating to a distance of a portion from the surface, acquired from the position information of that portion in the acquired input image and the position information of the surface.
[0003]
3. An information processing apparatus according to claim 2, wherein, in the area corresponding to the portion of the space having the surface in the background, the extraction unit extracts, as an object area, from the acquired input image, a portion in which the distance from the surface, acquired from the position information of that portion in the acquired input image, is greater than the first reference.
[0004]
4. An information processing apparatus according to claim 1, wherein the second reference is a reference relating to position information along an axis in a direction intersecting the surface, and wherein, in the area corresponding to the portion of the space not having the surface in the background, the extraction unit extracts, as an object area, from the acquired input image, a portion whose position information along the axis in the direction intersecting the surface indicates a position closer to the foreground side than the surface.
[0005]
5. An information processing apparatus according to claim 1, further comprising a recognition unit configured to recognize an operation input to the information processing apparatus based on the overall shape of a combined object area obtained by combining the object areas extracted by the extraction unit.
[0006]
6. An information processing apparatus according to claim 5, wherein, if it is determined, based on the shape of the combined object area, that the object is a hand of a user in a pose pointing in a direction, the recognition unit recognizes a touch on the surface by the fingertip of the user based on position information of the fingertip of the user acquired from the input image, and if it is determined that the object in the object area is not a hand of a user in a pose pointing in a direction, the recognition unit does not recognize a touch on the surface by the object.
[0007]
7. A method of controlling an information processing apparatus, the method comprising: acquiring an input image and, for each pixel of the input image, position information in a space having a predetermined surface forming part of a background; extracting, from the acquired input image, in an area corresponding to a portion of the space having the surface in the background, an object area corresponding to an object closer to the foreground than the surface, based on position information of the object in the acquired input image, position information of the surface, and a first reference; and extracting, from the acquired input image, in an area corresponding to a portion of the space not having the surface in the background, an object area corresponding to the object closer to the foreground than the surface, based on the position information of the object in the acquired input image and a second reference different from the first reference.
[0008]
8. A computer readable storage medium storing a computer program for causing a computer to perform the method of claim 7.
[0009]
9. An information processing apparatus comprising: an acquisition unit configured to acquire a telemetry image in which distance information corresponding to a height direction is represented by a pixel value, in a space having a table; and an extraction unit configured to extract, from the telemetry image, in a first zone having an upper surface of the table in a background, as an object area corresponding to an object at a position higher than the table in the space, a portion having a pixel value indicating that a relative height with respect to the table top is greater than a first threshold value, and to extract, from the telemetry image, in a second zone not having the table top in the background, as an object area, a portion having a pixel value indicating a height greater than a second threshold value defined based on a height of the upper surface of the table.
[0010]
10. An information processing apparatus according to claim 9, further comprising a first determining unit configured to determine, as the second threshold value, a value corresponding to the height of the upper surface of the table acquired in advance based on a pixel value of a telemetry image acquired by the acquisition unit.
[0011]
11. An information processing apparatus according to claim 10, wherein the first determining unit corrects the second threshold value by a correction amount corresponding to the distance from the portion of the boundary between the first zone and the second zone that is contained in the object area extracted from the first zone by the extraction unit.
[0012]
12. An information processing apparatus according to claim 10, wherein the first determining unit corrects the second threshold value to a value corresponding to a height lower than the height of the upper surface of the table at a position contained in the periphery of the portion of the boundary between the first zone and the second zone that is contained in the object area extracted from the first zone by the extraction unit, and corrects the second threshold value to a value corresponding to a height greater than the height of the upper surface of the table at a position not contained in that periphery.
[0013]
13. An information processing apparatus according to claim 10, wherein the first determining unit determines a plurality of candidates for the second threshold value based on the height of the upper surface of the table acquired on the basis of a pixel value of a telemetry image acquired in advance by the acquisition unit, and determines, from the plurality of candidates, a value to be used for extracting an object area in the second zone, based on the size of the object area extracted in the first zone.
[0014]
14. An information processing apparatus according to claim 9, further comprising a second determining unit configured to determine a value of the second threshold value, wherein the second determining unit determines the value of the second threshold value again, when a touch of the object in the object area on the table top is recognized, based on height information acquired from a part of the object area extracted by the extraction unit.
[0015]
15. A method of controlling an information processing apparatus, the method comprising: acquiring a telemetry image in which distance information corresponding to a height direction is represented by a pixel value, in a space comprising a table; extracting, from the acquired telemetry image, in a first zone having an upper surface of the table in a background, as an object area corresponding to an object at a position higher than the table in the space, a portion having a pixel value indicating that a relative height with respect to the table top is greater than a first threshold value; and extracting, from the acquired telemetry image, in a second zone not having the table top in the background, as an object area, a portion having a pixel value indicating a height greater than a second threshold value defined based on a height of the upper surface of the table.
[0016]
16. A computer readable storage medium storing a computer program for causing a computer to perform the method of claim 15.